Audio frequency response processing system

ABSTRACT

The invention provides a method and system for forming an output impulse response function. The method includes the steps of creating an initial impulse response, and dividing the impulse response into a head portion and a tail portion. The tail portion is high pass filtered, and low frequency components of the head portion are boosted. The low frequency boosted and high pass filtered respective head and tail portions are then combined into a modified output impulse response, which can then be used to spatialize an audio signal by convolving it.

FIELD OF THE INVENTION

The present invention relates to the field of audio signal processing and, in particular, to the field of simulating impulse response functions so as to provide for spatialization of audio signals.

BACKGROUND OF THE INVENTION

The human auditory system has evolved to locate accurately sounds that occur within the environment of the listener. The accuracy is thought to be derived primarily from two calculations carried out by the brain. The first is an analysis of the initial sound arrival and the arrival of near reflections (the direct sound or head portion of the sound), which normally help to locate a sound; the second is an analysis of the reverberant tail portion of a sound, which helps to provide an “environmental feel” to the sound. Of course, subtle differences between the sounds received at each ear are also highly relevant, especially upon the receipt of the direct sound and early reflections.

For example, in FIG. 1, there is illustrated a speaker 1 and listener 2 in a room environment. Taking the case of a single ear 3, the listener 2 receives a direct sound 4 from the speaker and a number of reflections 5, 6, and 7. It will be noted that the arrangement of FIG. 1 essentially shows a two dimensional sectional view and reflections off the floors or the ceilings are not shown. Further, the audio signal to only one ear is illustrated.

Often it is desirable to simulate the natural process of sound around a listener. For example, the listener, listening to a set of headphones, can be provided with an “out of head” experience of sounds appearing to emanate from an external environment. This can be achieved through the known process of determining an impulse response function for each ear for each sound and convolving the impulse response functions with a corresponding audio signal so as to produce the environmental effect of locating the sound in the external environment.
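By way of illustration only, the known convolution process referred to above may be sketched as follows. The function names `convolve` and `spatialize`, and the toy impulse responses, are hypothetical and are not taken from the specification; a practical system would use measured responses and a faster convolution method.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def spatialize(signal, ir_left, ir_right):
    """Convolve one audio signal with a separate impulse response for each ear."""
    return convolve(signal, ir_left), convolve(signal, ir_right)


# Hypothetical toy responses: the right ear hears the sound one sample later and quieter.
ir_left = [1.0, 0.0, 0.25]
ir_right = [0.0, 0.5, 0.0]
left, right = spatialize([1.0, 2.0], ir_left, ir_right)
```

Each output is longer than the input by the length of the impulse response less one sample, which is why long reverberant tails dominate the cost of the convolution.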

SUMMARY OF THE INVENTION

According to a first aspect of the invention there is provided a method of forming an output impulse response function comprising the steps of:

(a) creating an initial impulse response having a head portion and a tail portion;

(b) high pass filtering at least part of said tail portion to form a high pass filtered tail portion; and

(c) combining said high pass filtered tail portion with said head portion to form an output impulse response.

Preferably, the method includes the step of boosting low frequency components of said head portion of said initial impulse response prior to step (c).

Advantageously, the method includes the step of dividing the initial impulse response into the head and tail portions.

Conveniently, the method further comprises the step of utilising said output impulse response in addition to other impulse responses to virtually spatialize an audio signal around a listener.

The invention extends to an apparatus for forming an output impulse response function comprising:

(a) dividing means for dividing an initial impulse response into a head portion and a tail portion;

(b) high pass filtering means for high pass filtering at least part of the tail portion to form a high pass filtered tail portion;

(c) combining means for combining said high pass filtered tail portion with said head portion to form an output impulse response.

The invention further extends to an audio processing system for spatializing an audio signal, said system comprising:

an input means for inputting said audio signal;

convolution means connected to said input means, for convolving said audio signal with at least one impulse response function, said impulse response function having a head component and a high pass filtered tail component.

The invention still further contemplates a method of processing an audio input signal comprising the steps of:

(a) dividing an audio input signal into first and second streams;

(b) high pass filtering the second stream of the audio input signal;

(c) applying a reverberant tail to the second stream of the audio input signal; and

(d) combining the audio input signal from the first stream and the high pass filtered reverberated audio signal from the second stream.

The method may include the step of boosting low frequency components of the audio input signal of the first stream.

The invention still further provides a method of processing an audio input signal comprising the steps of:

(a) streaming the audio input signal into at least first and second streams;

(b) providing at least one high pass filtered tail impulse response signal;

(c) convolving the first stream of the audio input with the high pass filtered tail impulse response signal;

(d) providing at least one head impulse response signal;

(e) convolving the second stream of the audio input with the head impulse response signal; and

(f) combining the convolved outputs to provide a spatialized audio signal.

Typically, the method includes the step of boosting the low frequency component of the second stream to compensate for the reduction in low frequency components of the first stream.

The method typically includes the further steps of measuring the reduction in low frequency components from the high pass filtered tail impulse response, and using the measurement to derive a compensation factor which is ultimately applied to the second stream.

Conveniently, the method includes the steps of streaming the audio input signal into a third stream, adjusting the gain of the signal using the compensation factor, low pass filtering the adjusted signal, and combining the low pass filtered adjusted signal with the second stream, for subsequent convolving with the head impulse response signal.

The invention still further provides a method of spatializing an audio signal comprising the steps of:

(a) providing a head portion of an impulse response signal;

(b) providing a tail portion of an impulse response signal;

(c) high pass filtering the tail portion;

(d) convolving the high pass filtered tail portion with the audio signal;

(e) convolving the head portion with the audio signal; and

(f) combining the convolved signals to provide a spatialized output signal.
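Because convolution is linear, convolving the head and the filtered tail separately and then combining the results (steps (d) to (f) above) gives the same output as convolving once with a combined impulse response, provided that each portion keeps its original time offset. The following is a minimal sketch of that equivalence. The sample values are hypothetical integers chosen so that the comparison is exact, and the tail shown is assumed to have been high pass filtered already.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


# Hypothetical portions, zero padded to a common length so each keeps its own delay.
head = [4, 2, 0, 0, 0]
tail = [0, 0, 0, 1, -1]          # assumed to be already high pass filtered
combined = [h + t for h, t in zip(head, tail)]

audio = [1, 3, 2]

# Steps (d) to (f): convolve each portion separately, then combine.
separate = [a + b for a, b in zip(convolve(audio, head), convolve(audio, tail))]

# Alternative: a single convolution with the combined impulse response.
together = convolve(audio, combined)
```

Here `separate` and `together` hold identical sample values, so the choice between the two arrangements can be made on implementation grounds.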

BRIEF DESCRIPTION OF THE DRAWINGS

Notwithstanding any other forms which may fall within the scope of the present invention, the preferred forms of the invention will now be described by way of example only with reference to the accompanying drawings in which:

FIG. 1 illustrates schematically the process of projection of a sound to a listener in a room environment;

FIG. 2 illustrates a typical impulse response of a room;

FIG. 3 illustrates in detail the first 20 ms of this typical response;

FIG. 4 illustrates a flowchart of a method and system of a first embodiment of the invention;

FIG. 5 illustrates, in flowchart style, part of a stereo audio signal processing arrangement;

FIG. 6 illustrates a flowchart of a method and system of a second embodiment applied to the arrangement of FIG. 5; and

FIG. 7 shows a third embodiment of an audio processing system of the invention.

DETAILED DESCRIPTION OF THE EMBODIMENTS

Research by the present inventor into the nature of measured impulse response functions has led to various unexpected discoveries which can be utilised to advantageous effect in reducing the computational complexity of the convolution process in audio spatialization. From various measurements made by the present inventor of human listeners to audio spatialization systems the following important factors have been uncovered.

First, the low frequency components in the tail of an impulse response do not contribute to the sense of an enveloping acoustic space. Generally, this sense of “space” is created by the high frequency (greater than around 300 Hz) portion of the reverberant tail of the room impulse response.

Secondly, the low-frequency part of the tail of the reverberant response is often the cause of undesirable ‘resonance’ effects, particularly if the reverberant room response includes the modal resonances that are present in almost all rooms. This is often perceived by the listener as “bad equalisation”.

In FIG. 2 there is shown an example of an impulse response function 14 from a sound source in a room environment similar to that of FIG. 1. The response function includes a direct sound or head portion 15 and a tail portion 16. The tail portion 16 includes substantial low frequency components that do not provide significant directional information. Typically, the head portion occupies only the first two to three milliseconds of the total impulse response and, as in the example of FIG. 3, the head portion is often separated from the tail by a short segment of zero signal 17. It will be appreciated that the head portion includes direct sound (i.e. the first sound arrival 15A), but may also include initial closely following indirect sound (say floor and close wall direct echoes 15A to 15E). Although head and tail portions cannot always strictly be distinguished solely on a time basis, in practice, the head portion will seldom take up more than the first five milliseconds. The differences in amplitude also serve to distinguish between the two portions, with the tail portion essentially being representative of lower amplitude reverberations.
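A minimal sketch of a purely time-based division of an impulse response into head and tail portions is given below. The function name `split_head_tail` and the default boundary of 5 ms are assumptions made for illustration (the boundary is drawn from the upper figure mentioned above); as noted, amplitude may also be needed to distinguish the portions in practice.

```python
def split_head_tail(ir, sample_rate, head_ms=5.0):
    """Split an impulse response at head_ms, zero padding each part to full length.

    Zero padding keeps the tail at its original delay so that the two parts
    can later be recombined by simple sample-wise addition.
    """
    split = min(len(ir), int(round(sample_rate * head_ms / 1000.0)))
    head = list(ir[:split]) + [0.0] * (len(ir) - split)
    tail = [0.0] * split + list(ir[split:])
    return head, tail


# Hypothetical response sampled at 1 kHz: the first 5 samples form the head.
ir = [1.0, 0.6, 0.0, 0.0, 0.0, 0.1, -0.1, 0.05]
head, tail = split_head_tail(ir, sample_rate=1000)
```

At a sample rate of 48 kHz the same default would place the boundary after 240 samples.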

The preferred embodiment relies upon a substantial reduction in the complexity of the impulse response function through the removal of the low frequency components (say below 300 Hz) from the tail. Hence, in the preferred embodiment, the impulse response function to be utilised is manipulated in a predetermined manner. An example of the flowchart of the manipulation process is illustrated at 20 in FIG. 4. The initial impulse response 21 is divided into a direct sound portion 22 and a tail portion 23. The tail portion is high pass filtered at 24 so as to pass frequencies above 300 Hz, whilst the direct sound portion is optionally boosted at 25 at low frequencies substantially below 300 Hz. The two impulse response fragments are combined at 26 before being output at 27. The output response can then be utilised in any subsequent downstream audio processing system. For example, the impulse response can then be combined with other impulse responses as described in PCT Patent Application No. PCT/AU99/00002 entitled “Audio Signal Processing Method and Apparatus”, assigned to the present applicant, the contents of which are hereby incorporated specifically by cross reference. It will be appreciated that, in the time domain, the combined signal 28 will not look appreciably different from the original one, in that the visual effect of boosting and removal of the below 300 Hz components from the respective head and tail portions will not be substantial. However, the audible effect is significantly more marked. It will be appreciated that 300 Hz is an exemplary figure. In the case where, say, larger room spaces are being mimicked, frequencies of 200 Hz or less may be utilized in both the low and high pass filters.
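The manipulation process of FIG. 4 may be sketched as follows. This is an illustration under stated assumptions rather than the filters of the preferred embodiment: the specification does not fix a filter order, so simple first-order (one-pole) high and low pass filters are assumed, the boost is assumed to be formed by adding a scaled low pass copy of the head, and all function and parameter names are hypothetical.

```python
import math


def high_pass(x, sample_rate, cutoff_hz):
    """First-order high pass filter (assumed; attenuates content below cutoff_hz)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    a = rc / (rc + 1.0 / sample_rate)
    y, prev_x, prev_y = [], 0.0, 0.0
    for sample in x:
        prev_y = a * (prev_y + sample - prev_x)
        prev_x = sample
        y.append(prev_y)
    return y


def low_pass(x, sample_rate, cutoff_hz):
    """First-order low pass filter (assumed; attenuates content above cutoff_hz)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    b = (1.0 / sample_rate) / (rc + 1.0 / sample_rate)
    y, state = [], 0.0
    for sample in x:
        state += b * (sample - state)
        y.append(state)
    return y


def modify_impulse_response(ir, sample_rate, split_index,
                            cutoff_hz=300.0, boost_gain=1.0):
    """Divide (22, 23), high pass the tail (24), boost the head (25), combine (26)."""
    n = len(ir)
    head = list(ir[:split_index]) + [0.0] * (n - split_index)
    tail = [0.0] * split_index + list(ir[split_index:])
    filtered_tail = high_pass(tail, sample_rate, cutoff_hz)
    # Boost: add a scaled low pass copy of the head to the head itself.
    low = low_pass(head, sample_rate, cutoff_hz)
    boosted_head = [h + boost_gain * l for h, l in zip(head, low)]
    return [h + t for h, t in zip(boosted_head, filtered_tail)]


# Hypothetical response: a short head followed by a constant (low frequency) tail.
ir = [1.0, 0.5, 0.0, 0.0] + [0.2] * 100
output_ir = modify_impulse_response(ir, sample_rate=8000, split_index=4)
```

Setting `boost_gain` to zero gives the variant in which the head is left unmodified and only the tail is filtered.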

Other forms of audio processing environments utilising the invention are also possible. For example, in FIG. 5, an audio input signal 30 is shown being split into respective direct and indirect paths 30.1 and 30.2. The direct path 30.1 is split again into left and right paths which undergo gain adjusting at 34.L and 34.R before being summed at 35.L and 35.R respectively. The second channel 30.2 undergoes processing by means of a stereo reverberation filter 32, the outputs of which are similarly summed at 35.L and 35.R to provide left and right stereo channels.

In FIG. 6, the audio input signal 30 is shown being split into first and second channels 30.1 and 30.2, with the second channel 30.2 being high pass filtered by means of a high pass filter 31 prior to being processed by the stereo reverberation filter 32. The audio input signal of the first channel 30.1 is provided with a low frequency boost at 33, which has the effect of boosting the low frequency components of the signal, before being split into left and right inputs which are gain adjusted at 34.L and 34.R respectively, prior to being added at 35.L and 35.R to the output from the stereo reverberation filter 32, which effectively adds a “tail” to the high pass filtered audio signal output at 31. It will be appreciated that the high pass filter 31 and the reverberation filter 32 may be reversed in order. Alternatively, the high pass filter or a series of such filters may be built into the reverberation filter, which may be adapted to employ a “long convolution” reverberation procedure.
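The signal flow of FIG. 6 may be sketched as below. The sketch assumes first-order filters and models the stereo reverberation filter 32 as a pair of non-empty FIR impulse responses applied by direct convolution; these choices, and every function and parameter name, are illustrative assumptions rather than details taken from the specification.

```python
import math


def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def high_pass(x, sample_rate, cutoff_hz):
    """First-order high pass filter (assumed form of filter 31)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    a = rc / (rc + 1.0 / sample_rate)
    y, prev_x, prev_y = [], 0.0, 0.0
    for sample in x:
        prev_y = a * (prev_y + sample - prev_x)
        prev_x = sample
        y.append(prev_y)
    return y


def low_pass(x, sample_rate, cutoff_hz):
    """First-order low pass filter used to build the low frequency boost 33."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    b = (1.0 / sample_rate) / (rc + 1.0 / sample_rate)
    y, state = [], 0.0
    for sample in x:
        state += b * (sample - state)
        y.append(state)
    return y


def mix(direct, gain, wet):
    """Summer 35: gain adjusted direct path plus reverberated path."""
    padded = direct + [0.0] * (len(wet) - len(direct))
    return [gain * d + w for d, w in zip(padded, wet)]


def process_fig6(audio, reverb_left, reverb_right, gain_left, gain_right,
                 sample_rate, cutoff_hz=300.0, boost_gain=1.0):
    # First channel 30.1: low frequency boost 33.
    low = low_pass(audio, sample_rate, cutoff_hz)
    direct = [x + boost_gain * l for x, l in zip(audio, low)]
    # Second channel 30.2: high pass filter 31, then reverberation filter 32.
    filtered = high_pass(audio, sample_rate, cutoff_hz)
    wet_left = convolve(filtered, reverb_left)
    wet_right = convolve(filtered, reverb_right)
    # Gains 34.L / 34.R and summers 35.L / 35.R.
    return mix(direct, gain_left, wet_left), mix(direct, gain_right, wet_right)


left, right = process_fig6([1.0, -2.0, 4.0], [0.0, 0.3], [0.0, 0.2],
                           gain_left=0.5, gain_right=0.25, sample_rate=8000)
```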

Referring now to FIG. 7, a further embodiment of an audio processing system 50 of the invention is shown which combines features of both the first and second embodiments. A database of binaural tail impulse responses in respect of rooms having different acoustic qualities 51 is passed through a high pass filter 52 which effectively removes the low frequency portions of the tail impulse responses. The extent of the frequency removal in respect of each tail impulse is measured, normalised and stored in a low frequency compensation database 53. At the same time, the corresponding modified impulse responses are stored in database 54. The low frequency compensation database thus provides, in respect of each modified impulse response, a compensation factor typically inversely proportional to the percentage of remaining low frequencies, which can then be used in the manner described below to compensate for the reduction in low frequency components of the signal as a whole. The modified tail impulses from the modified impulse response database are selectively fed to a stereo reverberation FIR (finite impulse response) filter 55.
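The specification does not state how the extent of low frequency removal is measured or normalised. One possible approach, given purely as an assumption, is to compare the amplitude of the content below the cutoff frequency before and after filtering and to take the reciprocal of the remaining fraction, limited to a maximum value. In the sketch below the naive discrete Fourier transform, the RMS-style amplitude ratio, the limit `max_factor` and all names are hypothetical.

```python
import cmath
import math


def low_band_energy(x, sample_rate, cutoff_hz):
    """Energy in the DFT bins whose frequency lies below cutoff_hz (naive DFT)."""
    n = len(x)
    if n == 0:
        return 0.0
    energy = 0.0
    for k in range(n // 2 + 1):
        if k * sample_rate / n >= cutoff_hz:
            break
        bin_value = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n)
                        for i in range(n))
        energy += abs(bin_value) ** 2
    return energy


def compensation_factor(original_tail, filtered_tail, sample_rate,
                        cutoff_hz=300.0, max_factor=16.0):
    """Reciprocal of the fraction of low frequency amplitude that remains (assumed)."""
    before = low_band_energy(original_tail, sample_rate, cutoff_hz)
    after = low_band_energy(filtered_tail, sample_rate, cutoff_hz)
    if before == 0.0:
        return 1.0                      # nothing to compensate for
    remaining = math.sqrt(after / before)
    if remaining <= 1.0 / max_factor:
        return max_factor               # limit the boost when almost nothing remains
    return 1.0 / remaining


# Hypothetical tails: filtering is assumed to have halved the low frequency amplitude.
original = [1.0] * 32
filtered = [0.5] * 32
factor = compensation_factor(original, filtered, sample_rate=8000)
```

The limit is included because a factor that is strictly inversely proportional would grow without bound as the remaining low frequency content approaches zero.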

An audio input 56 is streamed into three channels, with a first channel 56.1 being input into the stereo reverberation filter 55, and a second channel 56.2 being input into a low pass filter 57 via a multiplier 58. The gain of the multiplier 58 and the resultant gain of the low pass filter are determined by the compensation factor retrieved from the low frequency compensation database 53 in respect of the corresponding modified impulse responses stored in the database 54.

A third channel 56.3 is input to a summer 59 via an adjustable gain amplifier 60. The summer 59 sums the inputs from the independently adjustable gain amplifier 60 and from the output of the low pass filter 57. The summed output is fed through a pair of HRTF left and right filters 61.L and 61.R. A database of HRTFs or head impulse response portions 62 has inputs leading to the filters 61.L and 61.R. Selected HRTFs from the database 62 are convolved in the HRTF filters with the summed input signals so as to provide spatialized outputs to the left and right summers 63.L and 63.R, which also receive spatialized outputs from the stereo reverberation filter 55. Binaural spatialized output signals 65.L and 65.R are output from the respective summers 63.L and 63.R. Effectively, the audio input signal 56 is thus spatialised using tail and head portions of impulse responses which are modified in the manner described above. The removal of low frequency components from the tail impulse responses is compensated for at multiplier 58 by the proportional increase in low frequency components to the head or HRTF portion of the impulse response signal. Effectively, the overall proportion of low frequency components in the spatialized sound thus remains approximately the same, and is effectively shifted in the above described process from the tail portions to the head portions of the spatializing impulse responses.
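The three-channel arrangement of FIG. 7 may be sketched as follows, with the database lookups replaced by arguments passed in directly. The first-order low pass filter, the direct convolution and all names are assumptions made for illustration; the tail responses are assumed to have been high pass filtered beforehand, as in database 54.

```python
import math


def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def low_pass(x, sample_rate, cutoff_hz):
    """First-order low pass filter (assumed form of filter 57)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    b = (1.0 / sample_rate) / (rc + 1.0 / sample_rate)
    y, state = [], 0.0
    for sample in x:
        state += b * (sample - state)
        y.append(state)
    return y


def add_padded(a, b):
    """Sample-wise sum of two lists, zero padding the shorter one."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [p + q for p, q in zip(a, b)]


def spatialize_fig7(audio, tail_left, tail_right, hrtf_left, hrtf_right,
                    compensation, direct_gain, sample_rate, cutoff_hz=300.0):
    # Channel 56.1: stereo reverberation FIR filter 55 with filtered tails.
    reverb_left = convolve(audio, tail_left)
    reverb_right = convolve(audio, tail_right)
    # Channel 56.2: multiplier 58 (compensation factor), then low pass filter 57.
    low = low_pass([compensation * x for x in audio], sample_rate, cutoff_hz)
    # Channel 56.3: adjustable gain amplifier 60, summed with the low band at 59.
    summed = [direct_gain * x + l for x, l in zip(audio, low)]
    # HRTF filters 61.L / 61.R, then output summers 63.L / 63.R.
    out_left = add_padded(convolve(summed, hrtf_left), reverb_left)
    out_right = add_padded(convolve(summed, hrtf_right), reverb_right)
    return out_left, out_right


# Hypothetical single-sample HRTFs and silent tails, to show the head path alone.
left, right = spatialize_fig7([1.0, 2.0, 3.0], [0.0], [0.0], [1.0], [0.5],
                              compensation=0.0, direct_gain=2.0, sample_rate=8000)
```

With a non-zero compensation factor, the low pass branch adds low frequency content to the head path only, which corresponds to the shift of low frequency components from the tail portions to the head portions described above.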

The filtering of the low frequency components in the arrangements of FIGS. 4, 6 and 7 has a number of advantages in addition to the simplification of the processing of the tail portion of the impulse response. These advantages include the elimination of possible resonant modes when the impulse response of FIGS. 2 and 3 is convolved with an input signal. Resonant modes in the reverberant filter type arrangements are also reduced, typically without changing the overall “feel” of the sound, by keeping low frequency components relatively constant.

It will be appreciated by the person skilled in the art that numerous variations and/or modifications may be made to the present invention as shown in the specific embodiments without departing from the spirit or scope of the invention as broadly described. The preferred embodiments are, therefore, to be considered in all respects to be illustrative and not restrictive.

1. A method of forming an output impulse response function comprising the steps of: (a) creating an initial impulse response having a head portion and a tail portion; (b) high pass filtering at least part of said tail portion to form a high pass filtered tail portion; (c) combining said high pass filtered tail portion with said head portion to form an output impulse response.
2. A method as claimed in claim 1 which includes the step of boosting low frequency components of said head portion of said initial impulse response prior to step (c).
3. A method as claimed in claim 2 which includes the step of dividing the initial impulse response into the head and tail portions.
4. A method as claimed in claim 2 wherein said step of high pass filtering is arranged to suppress frequencies below substantially 200 to 300 Hz.
5. A method as claimed in claim 2 which further comprises the steps of: utilising said output impulse response in addition to other impulse responses to virtually spatialize an audio signal around a listener; providing a tail portion of an impulse response signal; high pass filtering the tail portion; convolving the high pass filtered tail portion with the audio signal; convolving the head portion with the audio signal; and combining the convolved signals to provide a spatialized output signal.
6. A method as claimed in claim 1 which includes the step of dividing the initial impulse response into the head and tail portions.
7. A method as claimed in claim 6 wherein said step of high pass filtering is arranged to suppress frequencies below substantially 200 to 300 Hz.
8. A method as claimed in claim 6 which further comprises the steps of: utilising said output impulse response in addition to other impulse responses to virtually spatialize an audio signal around a listener; providing a tail portion of an impulse response signal; high pass filtering the tail portion; convolving the high pass filtered tail portion with the audio signal; convolving the head portion with the audio signal; and combining the convolved signals to provide a spatialized output signal.
9. A method as claimed in claim 1 wherein said step of high pass filtering is arranged to suppress frequencies below substantially 200 to 300 Hz.
10. A method as claimed in claim 9 which further comprises the steps of: utilising said output impulse response in addition to other impulse responses to virtually spatialize an audio signal around a listener; providing a tail portion of an impulse response signal; high pass filtering the tail portion; convolving the high pass filtered tail portion with the audio signal; convolving the head portion with the audio signal; and combining the convolved signals to provide a spatialized output signal.
11. A method as claimed in claim 1 which further comprises the step of: utilising said output impulse response in addition to other impulse responses to virtually spatialize an audio signal around a listener.
12. Apparatus for forming an output impulse response function comprising: dividing means for dividing an initial impulse response into a head portion and a tail portion; high pass filtering means for high pass filtering at least part of the tail portion to form a high pass filtered tail portion; combining means for combining said high pass filtered tail portion with said head portion to form an output impulse response.
13. Apparatus as claimed in claim 12 which includes boosting means for boosting low frequency components of said head portion of said response.
14. Apparatus as claimed in claim 13 wherein said high pass filtering means is arranged to suppress frequencies below substantially 200 to 300 Hz.
15. Apparatus as claimed in claim 13 wherein said boosting means is arranged to boost low frequency components of said head portion of said initial response below substantially 200 to 300 Hz.